AITopics | robust bi-tempered logistic loss

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Neural Information Processing Systems, Dec-25-2025, 16:58:34 GMT

We introduce a temperature into the exponential function and replace the softmax output layer of neural networks with a high-temperature generalization. Similarly, the logarithm in the training loss is replaced by a low-temperature logarithm. By tuning the two temperatures, we create loss functions that are non-convex even in the single-layer case. When the last layer of a neural network is replaced by our bi-temperature generalization of the logistic loss, training becomes more robust to noise. We visualize the effect of tuning the two temperatures in a simple setting and show the efficacy of our method on large datasets. Our methodology is based on Bregman divergences and is superior to a related two-temperature method that uses the Tsallis divergence.
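The two temperatures described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes the standard tempered logarithm and exponential, a low temperature t1 below 1 for the loss, and a high temperature t2 of at least 1 for the output layer. The function names are illustrative and are not the authors' code. The normalization constant is found here by plain bisection, which is a substitute chosen for simplicity and not necessarily the procedure used in the paper.

```python
import math


def log_t(x, t):
    """Tempered logarithm; reduces to math.log(x) when t == 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)


def exp_t(x, t):
    """Tempered exponential; reduces to math.exp(x) when t == 1."""
    if t == 1.0:
        return math.exp(x)
    base = max(0.0, 1.0 + (1.0 - t) * x)
    if base == 0.0:
        # Clipped region: zero for t < 1, divergent for t > 1.
        return 0.0 if t < 1.0 else math.inf
    return base ** (1.0 / (1.0 - t))


def tempered_softmax(activations, t, iters=100):
    """Return probabilities exp_t(a_i - lam, t) that sum to 1.

    lam is located by bisection (an assumption of this sketch).
    At lam = max(a) the sum is at least 1; at
    lam = max(a) - log_t(1/n, t) every term is at most 1/n.
    """
    n = len(activations)
    a_max = max(activations)
    lo = a_max
    hi = a_max - log_t(1.0 / n, t)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(exp_t(a - mid, t) for a in activations)
        if total > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [exp_t(a - lam, t) for a in activations]


def bi_tempered_loss(activations, label, t1, t2):
    """Bi-tempered loss for a one-hot target at index `label`.

    Assumes t1 < 1 <= t2 (t1 == t2 == 1 gives ordinary cross-entropy).
    """
    probs = tempered_softmax(activations, t2)
    power_sum = sum(p ** (2.0 - t1) for p in probs)
    return -log_t(probs[label], t1) - (1.0 - power_sum) / (2.0 - t1)
```

With t1 = t2 = 1 the sketch reduces to softmax followed by the usual log loss. With t1 below 1 the loss on any single example is bounded above by 1 / (1 - t1), which is one way to see why a badly mislabeled example has limited influence on training.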

bregman divergence, name change, robust bi-tempered logistic loss, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Neural Information Processing Systems, May-27-2025, 13:12:29 GMT

bregman divergence, neural network, robust bi-tempered logistic loss, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Neural Information Processing Systems, Oct-10-2024, 11:44:27 GMT

bregman divergence, neural network, robust bi-tempered logistic loss, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Amid, Ehsan; Warmuth, Manfred K.; Anil, Rohan; Koren, Tomer

Neural Information Processing Systems, Mar-19-2020, 02:47:59 GMT

We introduce a temperature into the exponential function and replace the softmax output layer of neural networks with a high-temperature generalization. Similarly, the logarithm in the training loss is replaced by a low-temperature logarithm. By tuning the two temperatures, we create loss functions that are non-convex even in the single-layer case. When the last layer of a neural network is replaced by our bi-temperature generalization of the logistic loss, training becomes more robust to noise. We visualize the effect of tuning the two temperatures in a simple setting and show the efficacy of our method on large datasets.

bregman divergence, neural network, robust bi-tempered logistic loss, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Amid, Ehsan; Warmuth, Manfred K.; Anil, Rohan; Koren, Tomer

arXiv.org Machine Learning, Jun-7-2019

We introduce a temperature into the exponential function and replace the softmax output layer of neural nets with a high-temperature generalization. Similarly, the logarithm in the log loss we use for training is replaced by a low-temperature logarithm. By tuning the two temperatures, we create loss functions that are non-convex even in the single-layer case. When the last layer of a neural net is replaced by our two-temperature generalization of logistic regression, training becomes more robust to noise. We visualize the effect of tuning the two temperatures in a simple setting and show the efficacy of our method on large data sets. Our methodology is based on Bregman divergences and is superior to a related two-temperature method using the Tsallis divergence.

artificial intelligence, divergence, machine learning, (16 more...)

arXiv.org Machine Learning

arXiv:1906.03361

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
